
Overview

The fetch_market_news.py script retrieves real-time market news for each stock with AI-powered sentiment analysis (positive/negative/neutral). News items are sourced from financial media and automatically categorized by sentiment.

Purpose

Fetches market news including:
  • Latest news headlines and summaries
  • AI-generated sentiment labels (positive/negative/neutral)
  • Publication timestamps
  • Source categories
  • Configurable news limit per stock (default: 50)

API Endpoint

URL
string
required
https://news-live.dhan.co/v2/news/getLiveNews
Method
string
required
POST

Request Payload

{
  "categories": ["ALL"],
  "page_no": 0,
  "limit": 50,
  "first_news_timeStamp": 0,
  "last_news_timeStamp": 0,
  "news_feed_type": "live",
  "stock_list": ["<ISIN>"],
  "entity_id": ""
}

Parameters

categories
array
default:"[\"ALL\"]"
News categories to fetch. Use ["ALL"] for all categories.
page_no
number
default:"0"
Page number for pagination (0-indexed)
limit
number
default:"50"
Number of news items to retrieve per stock. Maximum tested: 100.
first_news_timeStamp
number
default:"0"
Start timestamp filter (0 = no filter)
last_news_timeStamp
number
default:"0"
End timestamp filter (0 = no filter)
news_feed_type
string
default:"live"
Type of news feed (live or historical)
stock_list
array
required
Array of ISIN codes (typically one ISIN per request)
entity_id
string
default:""
Optional entity identifier
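
As a minimal sketch of how the parameters above fit together, the helper below builds the request body for one ISIN. The function name build_news_payload is hypothetical and not part of the script; the field names and defaults are taken from the payload shown above.

```python
def build_news_payload(isin, limit=50, page_no=0):
    """Return the request body for a single-ISIN news query.

    Hypothetical helper for illustration; field names and defaults
    mirror the documented request payload.
    """
    if not isin:
        raise ValueError("isin is required")
    return {
        "categories": ["ALL"],      # all news categories
        "page_no": page_no,         # 0-indexed page
        "limit": limit,             # items per request
        "first_news_timeStamp": 0,  # 0 = no start filter
        "last_news_timeStamp": 0,   # 0 = no end filter
        "news_feed_type": "live",
        "stock_list": [isin],       # one ISIN per request
        "entity_id": "",
    }


payload = build_news_payload("INE002A01018", limit=10)
print(payload["stock_list"], payload["limit"])
```

Keeping the payload in one function makes it easier to change limit or page_no without touching the request code.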

Output Files

market_news/{SYMBOL}_news.json
object
Per-stock news data with structure:
{
  "Symbol": "RELIANCE",
  "ISIN": "INE002A01018",
  "News": [
    {
      "Title": "Reliance Q3 results beat estimates",
      "Summary": "Full news text summary...",
      "Sentiment": "positive",
      "PublishDate": 1705334400,
      "Source": "Business News"
    }
  ]
}
Each stock gets up to 50 news items (configurable via NEWS_LIMIT).
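
A downstream consumer might read these files as in the sketch below. The helper names load_stock_news and titles_with_sentiment are hypothetical, and the news_dir argument assumes the files live in a market_news/ directory relative to wherever the caller points it; only the file naming and the JSON keys come from the structure shown above.

```python
import json
import os


def load_stock_news(symbol, news_dir="market_news"):
    """Load {news_dir}/{symbol}_news.json and return its "News" list."""
    path = os.path.join(news_dir, f"{symbol}_news.json")
    with open(path, "r") as f:
        doc = json.load(f)
    return doc.get("News", [])


def titles_with_sentiment(news, sentiment):
    """Return the titles of items whose Sentiment equals the given label."""
    return [n.get("Title", "") for n in news if n.get("Sentiment") == sentiment]
```

For example, titles_with_sentiment(load_stock_news("RELIANCE"), "positive") would list the positive headlines for one stock, provided the file exists.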

Function Signature

def fetch_market_news(item):
    """
    Fetches market news for a single stock.
    
    Args:
        item (dict): Stock object with 'Symbol' and 'ISIN' keys
        
    Returns:
        str | None: Status - "success", "empty", "rate_limit",
            "http_<status_code>", or "error"; None if the item
            lacks a Symbol or ISIN
        
    Process:
        1. Construct payload with stock's ISIN
        2. POST request to news API
        3. Extract and process news items
        4. Save to market_news/{SYMBOL}_news.json
    """

Dependencies

Python Packages
list
  • requests - HTTP client
  • json - JSON processing
  • os - File operations
  • time - Rate limit backoff
  • concurrent.futures.ThreadPoolExecutor - Parallel execution
Local Modules
list
  • pipeline_utils.BASE_DIR - Base directory path
  • pipeline_utils.get_headers() - Standard API headers
Input Files
list
  • master_isin_map.json - List of stock objects, each with Symbol and ISIN keys

Threading Configuration

MAX_THREADS
number
default:"15"
Number of concurrent threads. Set to 15 to avoid overwhelming the news API.
NEWS_LIMIT
number
default:"50"
Number of news items to fetch per stock (max tested: 100)

Code Example

import json
import requests
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from pipeline_utils import BASE_DIR, get_headers

INPUT_FILE = os.path.join(BASE_DIR, "master_isin_map.json")
OUTPUT_DIR = os.path.join(BASE_DIR, "market_news")
MAX_THREADS = 15
NEWS_LIMIT = 50

# Create the output directory if it is missing (no error if it already exists)
os.makedirs(OUTPUT_DIR, exist_ok=True)

def fetch_market_news(item):
    symbol = item.get("Symbol")
    isin = item.get("ISIN")
    
    if not symbol or not isin:
        return None
        
    output_path = os.path.join(OUTPUT_DIR, f"{symbol}_news.json")
    url = "https://news-live.dhan.co/v2/news/getLiveNews"
    
    payload = {
        "categories": ["ALL"],
        "page_no": 0,
        "limit": NEWS_LIMIT,
        "first_news_timeStamp": 0,
        "last_news_timeStamp": 0,
        "news_feed_type": "live",
        "stock_list": [isin],
        "entity_id": ""
    }
    
    headers = get_headers()
    
    try:
        response = requests.post(url, json=payload, headers=headers, timeout=10)
        
        if response.status_code == 200:
            data = response.json()
            news_items = data.get("data", {}).get("latest_news", [])
            
            if news_items:
                processed_news = []
                for news in news_items:
                    news_obj = news.get("news_object", {})
                    processed_news.append({
                        "Title": news_obj.get("title", ""),
                        "Summary": news_obj.get("text", ""),
                        "Sentiment": news_obj.get("overall_sentiment", "neutral"),
                        "PublishDate": news.get("publish_date", 0),
                        "Source": news.get("category", "")
                    })
                
                final_output = {"Symbol": symbol, "ISIN": isin, "News": processed_news}
                
                with open(output_path, "w") as f:
                    json.dump(final_output, f, indent=4)
                
                return "success"
            else:
                return "empty"
        elif response.status_code == 429:
            time.sleep(2)  # Rate limit backoff
            return "rate_limit"
        else:
            return f"http_{response.status_code}"
            
    except Exception:
        # Network failure, timeout, invalid JSON, or file write error
        return "error"

def main():
    with open(INPUT_FILE, "r") as f:
        stock_list = json.load(f)

    total = len(stock_list)
    print(f"Starting Market News Fetch (Limit: {NEWS_LIMIT}) for {total} stocks...")
    
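    counts = {}  # status -> number of stocks that returned it

    with ThreadPoolExecutor(max_workers=MAX_THREADS) as executor:
        # .get() lets entries without "Symbol" reach fetch_market_news,
        # which skips them, instead of raising KeyError here
        future_to_stock = {executor.submit(fetch_market_news, item): item.get("Symbol") for item in stock_list}

        for future in as_completed(future_to_stock):
            result = future.result() or "skipped"  # None means missing Symbol/ISIN
            counts[result] = counts.get(result, 0) + 1

    print(f"Done: {counts}")

if __name__ == "__main__":
    main()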

Usage

python3 fetch_market_news.py

Performance

  • Execution Time: ~4-6 minutes for 2,775 stocks
  • API Calls: 2,775 requests (one per stock)
  • Output: 2,775 individual JSON files in market_news/ directory
  • Concurrency: 15 parallel threads
  • News per Stock: up to 50 items (configurable)
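
The figures above imply an average request rate, which can be worked out as in the sketch below. It assumes all 15 threads stay busy for the whole run, which is a simplification; the function names are hypothetical.

```python
STOCKS = 2775   # requests per run, one per stock
THREADS = 15    # MAX_THREADS


def throughput(total_requests, minutes):
    """Average requests per second over a run of the given length."""
    return total_requests / (minutes * 60)


def seconds_per_request(total_requests, minutes, threads):
    """Average wall-clock seconds each thread spends on one request."""
    return threads / throughput(total_requests, minutes)


for minutes in (4, 6):
    rps = throughput(STOCKS, minutes)
    spr = seconds_per_request(STOCKS, minutes, THREADS)
    print(f"{minutes} min: {rps:.1f} req/s, {spr:.2f} s per request per thread")
```

Under that assumption the run averages roughly 8 to 12 requests per second, or about 1.3 to 1.9 seconds per request on each thread.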

Rate Limiting

  • Detects HTTP 429 (rate limit) responses
  • Sleeps 2 seconds in the worker thread after a 429; the request is not retried
  • Returns “rate_limit” status for monitoring
  • 10-second timeout per request
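
In the listed code, a 429 triggers a sleep and a "rate_limit" return, with no second attempt. If retries are wanted, one option is a wrapper like the sketch below. The name call_with_retry, the exponential delay schedule, and the injected send and sleep callables are assumptions for illustration, not part of the script.

```python
import time


def call_with_retry(send, max_retries=3, base_delay=2.0, sleep=time.sleep):
    """Call send() again after each HTTP 429, up to max_retries extra attempts.

    send: zero-argument callable returning an object with .status_code,
          e.g. lambda: requests.post(url, json=payload, headers=headers, timeout=10)
    sleep: injectable for testing; defaults to time.sleep
    """
    response = send()
    for attempt in range(max_retries):
        if response.status_code != 429:
            break
        # Wait 2 s, 4 s, 8 s, ... before the next attempt
        sleep(base_delay * (2 ** attempt))
        response = send()
    return response
```

The caller still needs to check the final status code, since the last response may itself be a 429 once the retries are used up.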

Sentiment Analysis

News items include AI-generated sentiment classification:
  • positive - Bullish/favorable news
  • negative - Bearish/unfavorable news
  • neutral - Non-directional/informational news
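
To summarize the labels for one stock, a small tally such as the following sketch may help. The function name sentiment_counts is hypothetical; treating a missing label as neutral mirrors the default used by the script when it extracts overall_sentiment.

```python
from collections import Counter


def sentiment_counts(news):
    """Count items per sentiment label, treating a missing label as neutral."""
    return Counter(item.get("Sentiment", "neutral") for item in news)


sample = [
    {"Title": "Results beat estimates", "Sentiment": "positive"},
    {"Title": "Board meeting scheduled", "Sentiment": "neutral"},
    {"Title": "New plant announced", "Sentiment": "positive"},
]
counts = sentiment_counts(sample)
print(counts["positive"], counts["negative"], counts["neutral"])  # 2 0 1
```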

Notes

  • Automatically creates market_news/ directory if it doesn’t exist
  • News is fetched fresh on every run (no caching)
  • Sentiment labels are pre-computed by Dhan’s AI engine
  • Maximum tested limit: 100 news items per stock
  • Use page_no parameter for pagination if needed
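
Pagination with page_no could be sketched as below. The helper collect_pages and its fetch_page callable are hypothetical, and the stopping rule assumes the API returns an empty list once page_no runs past the last page, which the notes above do not confirm.

```python
def collect_pages(fetch_page, max_pages=5):
    """Gather items from consecutive pages until one comes back empty.

    fetch_page: callable taking a 0-indexed page_no and returning a list
                of news items (for example, a wrapper around the POST
                request that sets "page_no" in the payload).
    max_pages:  hard cap so a misbehaving endpoint cannot loop forever.
    """
    items = []
    for page_no in range(max_pages):
        page = fetch_page(page_no)
        if not page:
            break  # assumed end-of-results signal
        items.extend(page)
    return items
```

Each extra page is another API call per stock, so the cap also bounds how much the request count grows.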